39 research outputs found

    Improved one-class SVM classifier for sounds classification

    ©2007 IEEE. This paper proposes applying optimized One-Class Support Vector Machines (1-SVMs) as a discriminative framework to address a specific audio classification problem. First, since an SVM-based classifier with a Gaussian RBF kernel is sensitive to the kernel width, the width is scaled in a distribution-dependent way to avoid underfitting and overfitting. Moreover, an advanced dissimilarity measure is introduced. We illustrate the performance of these methods on an audio database containing environmental sounds that may be of great importance for surveillance and security applications. Experiments conducted on a multi-class problem show that, with adequately chosen SVM parameters, we can efficiently address a sound classification problem characterized by complex real-world datasets.
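
The abstract above does not state how the kernel width is scaled, so the following is only an illustrative sketch of one common distribution-dependent choice, the median heuristic, which sets the RBF width to the median pairwise distance of the training samples. The function names `median_heuristic_gamma` and `rbf_kernel` are hypothetical, and this is not claimed to be the paper's rule.

```python
import math
from itertools import combinations
from statistics import median


def median_heuristic_gamma(samples):
    """Return gamma for k(x, y) = exp(-gamma * ||x - y||^2).

    Assumption: sigma is set to the median pairwise Euclidean distance
    (the "median heuristic"), so gamma = 1 / (2 * sigma^2). The paper's
    own scaling rule is not given in the abstract.
    """
    dists = [math.dist(a, b) for a, b in combinations(samples, 2)]
    sigma = median(dists)
    if sigma == 0:
        raise ValueError("samples are degenerate: median distance is zero")
    return 1.0 / (2.0 * sigma ** 2)


def rbf_kernel(x, y, gamma):
    """Gaussian RBF kernel value between two feature vectors."""
    return math.exp(-gamma * math.dist(x, y) ** 2)


if __name__ == "__main__":
    feats = [(0.0, 0.0), (0.0, 2.0), (0.0, 4.0)]
    g = median_heuristic_gamma(feats)
    print(g, rbf_kernel(feats[0], feats[1], g))
```

Because gamma follows the spread of the data, rescaling every feature by a constant leaves the kernel values between corresponding samples unchanged, which is the practical point of a data-dependent width.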

    Multiple-Image Fusion Encryption (MIFE) Using Discrete Cosine Transformation (DCT) and Pseudo Random Number Generators

    This chapter proposes a new multiple-image encryption algorithm based on spectral fusion of watermarked images and new chaotic generators. The Logistic-May (LM), May-Gaussian (MG), and Gaussian-Gompertz (GG) generators were chosen for their good properties, which correct the flaws that 1D chaotic maps (Logistic, May, Gaussian, Gompertz) show when used individually. First, the discrete cosine transformation (DCT) and a low-pass filter of appropriate size are used to combine the target watermarked images in the spectral domain into two different multiplex images. Second, each of the two images is divided into small blocks, which are shuffled by changing their positions according to the order generated by a chaotic sequence from the Logistic-May (LM) system. Finally, the two scrambled images are fused by a nonlinear mathematical expression based on Cramer’s rule to obtain two hybrid encrypted images. After the decryption step, the hidden message can be retrieved from the watermarked image without any loss. The security analysis and experimental simulations confirm that the proposed algorithm has good encryption performance: it can encrypt a large number of images of different types, combined with text, while maintaining a reduced Mean Square Error (MSE) after decryption.
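
The block-scrambling step described above can be sketched as follows. This is a minimal illustration under stated assumptions: it uses the plain logistic map as a stand-in for the chapter's Logistic-May generator (whose definition is not given in the abstract), and the function names, the seed `x0`, and the parameter `r` are hypothetical.

```python
def logistic_sequence(x0, r, n, burn_in=100):
    """Iterate x <- r * x * (1 - x); discard a burn-in, return n values."""
    x = x0
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    seq = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        seq.append(x)
    return seq


def chaotic_permutation(n, x0=0.3141, r=3.99):
    """Derive a permutation of range(n) by ranking the chaotic values.

    Assumption: (x0, r) act as the secret key; the plain logistic map
    replaces the Logistic-May system used in the chapter.
    """
    seq = logistic_sequence(x0, r, n)
    return sorted(range(n), key=lambda i: seq[i])


def scramble(blocks, perm):
    """Place block perm[pos] at position pos."""
    return [blocks[i] for i in perm]


def unscramble(scrambled, perm):
    """Invert scramble() given the same permutation."""
    original = [None] * len(perm)
    for pos, i in enumerate(perm):
        original[i] = scrambled[pos]
    return original


if __name__ == "__main__":
    blocks = list("ABCDEFGH")  # stand-ins for small image blocks
    perm = chaotic_permutation(len(blocks))
    mixed = scramble(blocks, perm)
    print(mixed, unscramble(mixed, perm) == blocks)
```

Since the permutation is fully determined by the seed and map parameter, the receiver can regenerate it and undo the shuffle without any loss, consistent with the lossless recovery the abstract reports.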

    Speech Emotion Recognition in Acted and Spontaneous Context

    Little attention has been paid so far to the context in which databases used for the study of emotion through the vocal channel are recorded. We therefore propose and evaluate an emotion classification system focusing on the differences between acted and spontaneous emotional speech, using two different databases: SAVEE and IEMOCAP. For this work, we examined wavelet packet energy and entropy features on the Mel, Bark and ERB scales, with a Hidden Markov Model (HMM) as the classification system. Experimental results show that the proposed method is a feasible technique for emotion classification in both acted and spontaneous contexts, and they highlight the performance difference of the system between the two contexts. ERB scale features give better performance than the other studied features, with a recognition accuracy of 78.75% for the acted context and 50.06% for the spontaneous context.
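
To make "wavelet packet energy and entropy features" concrete, here is a minimal sketch under stated assumptions: it uses a uniform Haar wavelet packet tree rather than the Mel-, Bark-, or ERB-shaped trees the abstract refers to, and all function names are hypothetical. It is not the authors' feature extractor.

```python
import math


def haar_step(x):
    """One orthonormal Haar analysis step: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_leaves(x, levels):
    """Split every node at every level (full packet tree); return leaves.

    Leaves are in natural (filter-bank) order, not frequency order.
    The signal length should be divisible by 2**levels.
    """
    nodes = [list(x)]
    for _ in range(levels):
        nxt = []
        for node in nodes:
            a, d = haar_step(node)
            nxt.extend([a, d])
        nodes = nxt
    return nodes


def subband_energies(leaves):
    """Energy of each subband: sum of squared coefficients."""
    return [sum(c * c for c in leaf) for leaf in leaves]


def energy_entropy(energies):
    """Shannon entropy (nats) of the normalized subband energies."""
    total = sum(energies)
    if total == 0:
        return 0.0
    probs = [e / total for e in energies if e > 0]
    return -sum(p * math.log(p) for p in probs)


if __name__ == "__main__":
    frame = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    energies = subband_energies(wavelet_packet_leaves(frame, 2))
    print(energies, energy_entropy(energies))
```

In a recognition pipeline, a vector of per-subband energies and their entropy would be computed per frame and passed to the classifier; the perceptual scales change which nodes of the tree are split, not how the energy and entropy are calculated.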

    Application of Perceptual Filtering Models to Noisy Speech Signals Enhancement

    This paper describes a new speech enhancement approach using perceptually based noise reduction. The approach applies two perceptual filtering models to noisy speech signals: the gammatone and gammachirp filter banks, with nonlinear resolution according to the equivalent rectangular bandwidth (ERB) scale. The perceptual filtering yields a number of subbands that are individually spectrally weighted and modified according to two different noise suppression rules. An accurate noise estimate is important for reducing the musical noise artifacts that appear in the processed speech after a classic subtractive process; we therefore use continuous noise estimation algorithms. The performance of the proposed approach is evaluated on speech signals corrupted by real-world noises. Using objective tests based on the perceptual quality PESQ score and the quality ratings of signal distortion (SIG), noise distortion (BAK) and overall quality (OVRL), and a subjective test based on the quality rating of automatic speech recognition (ASR), we demonstrate that our speech enhancement approach, which uses filter banks modeling the human auditory system, outperforms conventional spectral modification algorithms in improving the quality and intelligibility of the enhanced speech signal.
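
As an illustration of the "nonlinear resolution according to the ERB scale" mentioned above, the sketch below spaces filter-bank center frequencies uniformly on the ERB-rate scale. It assumes the widely used Glasberg and Moore (1990) formulas; the abstract does not say which ERB formula the paper uses, and the function names and band count are hypothetical.

```python
import math


def erb_bandwidth(f_hz):
    """ERB in Hz at center frequency f_hz (Glasberg & Moore, 1990)."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)


def hz_to_erb_rate(f_hz):
    """Map frequency in Hz to the ERB-rate scale."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)


def erb_rate_to_hz(e):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37


def erb_center_frequencies(f_low, f_high, n_bands):
    """n_bands (>= 2) center frequencies equally spaced in ERB-rate."""
    lo, hi = hz_to_erb_rate(f_low), hz_to_erb_rate(f_high)
    step = (hi - lo) / (n_bands - 1)
    return [erb_rate_to_hz(lo + k * step) for k in range(n_bands)]


if __name__ == "__main__":
    for fc in erb_center_frequencies(100.0, 4000.0, 8):
        print(round(fc, 1), round(erb_bandwidth(fc), 1))
```

The resulting bands are narrow and densely packed at low frequencies and wider at high frequencies, which is the kind of resolution a gammatone or gammachirp bank is designed to provide.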